Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[controller] do not delay RT topic deletion if the store is no longer hybrid #1234

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

arjun4084346
Copy link
Contributor

@arjun4084346 arjun4084346 commented Oct 10, 2024

Summary, imperative, start upper case, don't end with a period

When TopicCleanupService tries to delete RT topics, it does not have to wait for all the corresponding VT topics to get deleted if all the versions of the store has changed from hybrid to batch. If all the versions are batch, it can be safely assumed that no VT topic is consuming from that RT topic.

Resolves #XXX

How was this PR tested?

added a unit test in TestTopicCleanupService

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

Copy link
Contributor

@sushantmane sushantmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add E2E test that verifies RT topic marked for deletion is

  1. it is not deleted when there are hybrid versions
  2. deleted once all hybrid versions are removed

@arjun4084346 arjun4084346 force-pushed the rtTopicDeletion branch 7 times, most recently from 9ec9fce to ddf8586 Compare October 24, 2024 18:47
Comment on lines 426 to 427
if (version.getHybridStoreConfig() != null && VersionStatus.isVersionErrored(versionStatus)
&& VersionStatus.isVersionKilled(versionStatus)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get this part.. so only when a hybrid version is errored and killed, we think it's hybrid?

private void safeDeleteRTTopic(String clusterName, String storeName, Store store) {
private void safeDeleteRTTopic(String clusterName, String storeName) {
Store store = getHelixVeniceClusterResources(clusterName).getStoreMetadataRepository().getStore(storeName);
if (canDeleteRTTopic(clusterName, storeName, store)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the conditions for RT topic deletion exist in multiple places, e.g. L3509, canDeleteRTTopic. is it possible to merge these conditions into a single function for simplicity?

if (canDeleteRTTopic(clusterName, storeName, store)) {
deleteRTTopic(clusterName, storeName);
} else {
LOGGER.warn("Topic deletion for topic: {} is delayed.", Version.composeRealTimeTopic(storeName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it easy to log the reason for delayed deletion?

return !Version.containsHybridVersion(store.getVersions()) && canDeleteRTTopic(clusterName, storeName);
}

public boolean canDeleteRTTopic(String clusterName, String storeName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename the function somehow to reflect the fact that we are querying the child controllers for their opinions?

Comment on lines +3565 to +3568
truncateKafkaTopic(rtTopicToDelete);
for (ControllerClient controllerClient: controllerClientMap.values()) {
controllerClient.deleteKafkaTopic(rtTopicToDelete);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these lines are duplicated later. Can we extract them into a small helper function?

@@ -135,10 +122,26 @@ public TopicCleanupService(
} else {
this.sourceOfTruthPubSubAdminAdapter = null;
}

if (!this.admin.isParent()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is keyword this here necessary? My understanding for our code convention is that this is used when there is ambiguity or cases it has to be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's main reason to move the child controller api calls into constructor?

@@ -235,49 +238,46 @@ Admin getAdmin() {
* If version topic deletion takes more than certain time it refreshes the entire topic list and start deleting from RT topics again.
*/
void cleanupVeniceTopics() {
// todo - instead of adding all topics into the same queue we must simply use separate queues for RT and nonRT
Copy link
Contributor

@lluwm lluwm Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, todo -> "TODO" so that IDE can highlight it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants